Introduction

Main focus of this side project is to find phase-transition which happen when we are changing boundary/Acceptability of different opinion and Narrowness of identity group. Result should be a simple graph showing how ESBG polarization changes with change of Acceptability of different opinion and Narrowness of identity group.

Note: Experiment is still running, we are at 26 complete sets of all values combinations out of 120. Also note, that it seems that we are not done yet, we probably will need more sets than 120 and more values of Narrowness of identity group.

Loading data

Data are at http://github.com/frantisek901/Spirals/Experiment. Experiment is still running and I, FranČesko, from time to time actualize the *.csv files at GitHub, then I run script experiment.R which loads the data. Now, 2022-03-25, we are at 20 %, roughly. Who is not interested in working with megabytes of *.csv files, might use compiled phase2w.RData.

Now we load and aggregate these data and factorize and rename selected variables:

## Loading stored data
load("phase2w.RData")

## Preparing individual data 'dfi'
dfi = phase2w %>% 
  ## Filtering variables:
  filter(RS <= 12 | (RS >= 61 & RS <= 74), identity) %>%  
  
  ## Changing some variables to factors:
  mutate(id_threshold = factor(id_threshold),
         boundary = factor(boundary),
         opinions = factor(opinions))


## Summarising 'dfi' into 'dfs':
dfs = dfi %>% 
  group_by(opinions, boundary, identity, id_threshold) %>% 
  summarise(ESBG = mean(ESBG)) %>% ungroup() %>% 

  ## Renaming variables according 2022-03-18 meeting:
  prejmenuj(1:4, c("Opinion dimensions:", "Acceptability of different opinion:", "Identity:", 
                    "Narrowness of identity group:"))

Graph

Now, let’s show our results graphically!

Color maps

dfs %>% 
  ggplot() +
  aes(x = `Acceptability of different opinion:`, fill = ESBG, label = round(ESBG, 2),
      y = `Narrowness of identity group:`) +
  facet_wrap(vars(`Opinion dimensions:`), ncol=1) +
  geom_point(alpha = 1, size = 13, shape = 22, col = "white") +
  geom_text(color = "white", size = 3) +
  scale_fill_gradient2(low = "green", mid = "red", high = "black", midpoint = 0.3) +
  labs(title = "Change of polarization in simulations by 'Opinion dimensions' (1, 2, 4),\n'Narrowness of identity group' (0.35--0.6, NA) and\n'Average acceptability of different opinions' (0.05--0.3)",
       x = "Average acceptability of different opinions", 
       caption = "Note: Value 'NA' in 'Narrowness of identity group' indicates that identity constraint is not used.") +
  guides(alpha = "none") +
  theme_minimal() +
  theme(legend.position = "top")  

  1. the more dimensions the less polarization,
  2. the least polarized region is quarter of circle in the right bottom corner, then the left-hand side stripe, and the most polarized is the resting region (the most polarized part of this region seems to move with change of dimensions, but may be it is the artifact of low number of simulations – sometimes it is upper-right corner, sometimes upper border, sometimes right border)

Pulped clouds

For the first graph on pulped clouds we aggregate Acceptability of different opinion into 13 categories (we just round 121 original values to 2 digits). Two different levels of polarization are seeable here, but it doesn’t look like clouds…

## For presenting variability we try now boxplots on individual data (non-aggregated):
dfi %>%
  filter(id_threshold %in% seq(0.36, 0.6, 0.03)) %>% 
  # sample_n(2000) %>%
  ## Selecting variables:
  select(opinions, boundary, id_threshold, ESBG) %>% 
  mutate(boundary = as.numeric(as.character(boundary))) %>% 
  
  ## Renaming variables according 2022-03-18 meeting:
  prejmenuj(1:3, c("Opinion dimensions:", "Acceptability of different opinion:",  
                    "Narrowness of identity group:")) %>% 

  ## Graph itself:
  ggplot() +
  aes(x = `Acceptability of different opinion:`, y = ESBG, 
      fill = `Narrowness of identity group:`,
      col = `Narrowness of identity group:`, 
      group = `Acceptability of different opinion:`) +
  facet_wrap(vars(`Narrowness of identity group:`, `Opinion dimensions:`), ncol=3) +
  geom_boxplot(alpha = 0.2) +
  geom_jitter(alpha = 0.2) +
  scale_x_continuous(breaks = seq(0.05, 0.30, 0.05)) +
  labs(title = "Change of polarization in simulations by 'Opinion dimensions' (1, 2, 4),\n'Narrowness of identity group' (0.36--0.6) and 'Average acceptability of different opinions' (0.05--0.3)",
       x = "Average acceptability of different opinions", y = "Polarization") +
  theme_minimal() +
  theme(legend.position = "top")  

Now same graph, but with every value:

## For presenting variability we try now boxplots on individual data (non-aggregated):
dfi %>%
  filter(id_threshold %in% seq(0.35, 0.6, 0.01)) %>% 
  # sample_n(2000) %>%
  ## Selecting variables:
  select(opinions, boundary, id_threshold, ESBG) %>% 
  mutate(boundary = as.numeric(as.character(boundary))) %>% 
  
  ## Renaming variables according 2022-03-18 meeting:
  prejmenuj(1:3, c("Opinion dimensions:", "Acceptability of different opinion:",  
                    "Narrowness of identity group:")) %>% 

  ## Graph itself:
  ggplot() +
  aes(x = `Acceptability of different opinion:`, y = ESBG, 
      fill = `Narrowness of identity group:`,
      col = `Narrowness of identity group:`, 
      group = `Acceptability of different opinion:`) +
  facet_wrap(vars(`Narrowness of identity group:`, `Opinion dimensions:`), ncol=3) +
  geom_boxplot(alpha = 0.2) +
  geom_jitter(alpha = 0.2) +
  scale_x_continuous(breaks = seq(0.05, 0.30, 0.05)) +
  labs(title = "Change of polarization in simulations by 'Opinion dimensions' (1, 2, 4),\n'Narrowness of identity group' (0.35--0.6) and 'Average acceptability of different opinions' (0.05--0.3)",
       x = "Average acceptability of different opinions", y = "Polarization") +
  theme_minimal() +
  theme(legend.position = "top")  

Now, same data but slightly different graph

## For presenting variability we try now boxplots on individual data (non-aggregated):
dfi %>%
  filter(id_threshold %in% seq(0.35, 0.6, 0.05)) %>% 
  # sample_n(2000) %>%
  ## Selecting variables:
  select(opinions, boundary, id_threshold, ESBG) %>% 
  mutate(boundary = as.numeric(as.character(boundary))) %>% 

  ## Renaming variables according 2022-03-18 meeting:
  prejmenuj(1:3, c("Opinion dimensions:", "Acceptability of different opinion:",  
                    "Narrowness of identity group:")) %>% 


  ## Graph itself:
  ggplot(aes(x = `Acceptability of different opinion:`, y = ESBG, 
             fill = `Narrowness of identity group:`,
             col = `Narrowness of identity group:`, 
             group = `Acceptability of different opinion:`)) +
  facet_wrap(vars(`Narrowness of identity group:`, `Opinion dimensions:`), ncol=3) +
  geom_point(alpha = 0.15) +
  scale_x_continuous(breaks = seq(0.05, 0.30, 0.05)) +
  labs(title = "Change of polarization in simulations by 'Opinion dimensions' (1, 2, 4),\n'Narrowness of identity group' (0.35--0.6) and 'Average acceptability of different opinions' (0.05--0.3)",
       x = "Average acceptability of different opinions", y = "Polarization") +
  theme_minimal() +
  theme(legend.position = "top")

## For presenting variability we try now boxplots on individual data (non-aggregated):
dfi %>%
  filter(id_threshold %in% seq(0.35, 0.6, 0.01)) %>% 
  # sample_n(2000) %>%
  ## Selecting variables:
  select(opinions, boundary, id_threshold, ESBG) %>% 
  mutate(boundary = as.numeric(as.character(boundary))) %>% 
  
  ## Renaming variables according 2022-03-18 meeting:
  prejmenuj(1:3, c("Opinion dimensions:", "Acceptability of different opinion:",  
                    "Narrowness of identity group:")) %>% 


  ## Graph itself:
  ggplot(aes(x = `Acceptability of different opinion:`, y = ESBG, 
             fill = `Narrowness of identity group:`,
             col = `Narrowness of identity group:`, 
             group = `Acceptability of different opinion:`)) +
  facet_wrap(vars(`Narrowness of identity group:`, `Opinion dimensions:`), ncol=3) +
  geom_point(alpha = 0.15) +
  scale_x_continuous(breaks = seq(0.05, 0.30, 0.05)) +
  labs(title = "Change of polarization in simulations by 'Opinion dimensions' (1, 2, 4),\n'Narrowness of identity group' (0.35--0.6) and 'Average acceptability of different opinions' (0.05--0.3)",
       x = "Average acceptability of different opinions", y = "Polarization") +
  theme_minimal() +
  theme(legend.position = "top")

Regression

m = lm(ESBG ~ opinions+as.numeric(id_threshold)+as.numeric(boundary), data = filter(dfi, identity))
ms = summary(m)

p1 = lm(ESBG ~ opinions+id_threshold+boundary, data = filter(dfi, identity))
p1s = summary(p1)
p1s
## 
## Call:
## lm(formula = ESBG ~ opinions + id_threshold + boundary, data = filter(dfi, 
##     identity))
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.44749 -0.04425 -0.00045  0.05690  0.43923 
## 
## Coefficients:
##                   Estimate Std. Error  t value Pr(>|t|)    
## (Intercept)       0.300705   0.003400   88.453  < 2e-16 ***
## opinions2        -0.191067   0.001144 -167.039  < 2e-16 ***
## opinions4        -0.296721   0.001144 -259.406  < 2e-16 ***
## id_threshold0.36  0.004139   0.003367    1.229  0.21907    
## id_threshold0.37  0.010383   0.003367    3.083  0.00205 ** 
## id_threshold0.38  0.017604   0.003367    5.228 1.72e-07 ***
## id_threshold0.39  0.026516   0.003367    7.874 3.49e-15 ***
## id_threshold0.4   0.036228   0.003367   10.759  < 2e-16 ***
## id_threshold0.41  0.041148   0.003367   12.220  < 2e-16 ***
## id_threshold0.42  0.050682   0.003367   15.051  < 2e-16 ***
## id_threshold0.43  0.061519   0.003367   18.269  < 2e-16 ***
## id_threshold0.44  0.064611   0.003367   19.187  < 2e-16 ***
## id_threshold0.45  0.066940   0.003367   19.879  < 2e-16 ***
## id_threshold0.46  0.076807   0.003367   22.809  < 2e-16 ***
## id_threshold0.47  0.080585   0.003367   23.931  < 2e-16 ***
## id_threshold0.48  0.082837   0.003367   24.600  < 2e-16 ***
## id_threshold0.49  0.083915   0.003367   24.920  < 2e-16 ***
## id_threshold0.5   0.086817   0.003367   25.782  < 2e-16 ***
## id_threshold0.51  0.087339   0.003367   25.937  < 2e-16 ***
## id_threshold0.52  0.087116   0.003367   25.871  < 2e-16 ***
## id_threshold0.53  0.086267   0.003367   25.618  < 2e-16 ***
## id_threshold0.54  0.087524   0.003367   25.992  < 2e-16 ***
## id_threshold0.55  0.086397   0.003367   25.657  < 2e-16 ***
## id_threshold0.56  0.086641   0.003367   25.729  < 2e-16 ***
## id_threshold0.57  0.085483   0.003367   25.385  < 2e-16 ***
## id_threshold0.58  0.086064   0.003367   25.558  < 2e-16 ***
## id_threshold0.59  0.085587   0.003367   25.417  < 2e-16 ***
## id_threshold0.6   0.081950   0.003367   24.336  < 2e-16 ***
## boundary0.06      0.003127   0.003367    0.929  0.35306    
## boundary0.07      0.007391   0.003367    2.195  0.02819 *  
## boundary0.08      0.013873   0.003367    4.120 3.80e-05 ***
## boundary0.09      0.019130   0.003367    5.681 1.35e-08 ***
## boundary0.1       0.019350   0.003367    5.746 9.17e-09 ***
## boundary0.11      0.026455   0.003367    7.856 4.04e-15 ***
## boundary0.12      0.038873   0.003367   11.544  < 2e-16 ***
## boundary0.13      0.051628   0.003367   15.332  < 2e-16 ***
## boundary0.14      0.058907   0.003367   17.493  < 2e-16 ***
## boundary0.15      0.062217   0.003367   18.476  < 2e-16 ***
## boundary0.16      0.064484   0.003367   19.149  < 2e-16 ***
## boundary0.17      0.065194   0.003367   19.360  < 2e-16 ***
## boundary0.18      0.066237   0.003367   19.670  < 2e-16 ***
## boundary0.19      0.065460   0.003367   19.439  < 2e-16 ***
## boundary0.2       0.063153   0.003367   18.754  < 2e-16 ***
## boundary0.21      0.061926   0.003367   18.390  < 2e-16 ***
## boundary0.22      0.058444   0.003367   17.356  < 2e-16 ***
## boundary0.23      0.056337   0.003367   16.730  < 2e-16 ***
## boundary0.24      0.053622   0.003367   15.924  < 2e-16 ***
## boundary0.25      0.049731   0.003367   14.769  < 2e-16 ***
## boundary0.26      0.045521   0.003367   13.518  < 2e-16 ***
## boundary0.27      0.041772   0.003367   12.405  < 2e-16 ***
## boundary0.28      0.036063   0.003367   10.710  < 2e-16 ***
## boundary0.29      0.031794   0.003367    9.442  < 2e-16 ***
## boundary0.3       0.028178   0.003367    8.368  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1072 on 52675 degrees of freedom
## Multiple R-squared:  0.5877, Adjusted R-squared:  0.5872 
## F-statistic:  1444 on 52 and 52675 DF,  p-value: < 2.2e-16
f = lm(ESBG ~ opinions*id_threshold*boundary, data = filter(dfi, identity))
fs = summary(f)

I just wanna know how much variability we can explain by the full model. OK, we might explain 62.4 %, it means there is 37.6 % of variability, which is unexplainable in principle! Resp. we can’t explain it by any variable which we manipulated during simulation experiments. As I mentioned above, we might try explain it via detailed description of initial condition (however randomly generated) or via description of the course of the simulation.

BTW, the full model is not the best, fully factorised model with main effects only is the best (difference in BIC 1.66194^{4}), this model is better regarding the BIC than the non-factorized model with main effects (difference in BIC -1791.9). Just for order, the model with factorized main effects explained 58.8 % and the model with non-factorized main effects 56.9 % of variability.